Feat/litellm passthrough #199 by endre82 · Pull Request #201 · rynfar/meridian

endre82 · 2026-03-31T20:19:36Z

feat: add LiteLLM passthrough adapter with x-litellm-* header detection

Add tsx as dev dependency and update supervisor to prefer bun > tsx > npx
Detect LiteLLM by user-agent header (litellm/*) in addition to x-litellm-* headers
Force stream=false for all LiteLLM requests (healthchecks don't send x-litellm-* headers)
Increase MAX_CONCURRENT_SESSIONS default from 10 to 50
Increase rate-limit retry attempts (2→3) and base delay (1s→2s) with exponential backoff
Allow rate-limit retry even after partial content was yielded
Add DEBUG_PROXY=true flag for detailed error diagnosis
Add prefersStreaming() to AgentAdapter interface (unused but available)
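For illustration, the retry change in the list above (3 attempts, 2 s base delay, exponential backoff) might produce a delay schedule like the one sketched here. The constant and function names are hypothetical, and the doubling factor is an assumption, since the commit message only says "exponential backoff".

```typescript
// Values taken from the commit message: 3 attempts, 2 s base delay.
// Names are hypothetical; the factor of 2 per attempt is an assumption.
const RATE_LIMIT_MAX_RETRIES = 3;
const RATE_LIMIT_BASE_DELAY_MS = 2000;

// Delay before retry number `attempt` (0-based), doubling each time.
function rateLimitBackoffMs(
  attempt: number,
  baseMs: number = RATE_LIMIT_BASE_DELAY_MS,
): number {
  return baseMs * 2 ** attempt;
}

const retrySchedule: number[] = [];
for (let attempt = 0; attempt < RATE_LIMIT_MAX_RETRIES; attempt++) {
  retrySchedule.push(rateLimitBackoffMs(attempt));
}
console.log(retrySchedule); // delays in ms: 2000, 4000, 8000
```

Under these assumptions the worst case adds 14 s of waiting before the request finally fails.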


rynfar · 2026-04-01T16:47:10Z

Thanks for putting this together — the LiteLLM adapter concept is solid and we want to get it in. However, the PR bundles several unrelated changes that need to be separated before we can merge anything.

What we're pulling out and merging separately:
The core LiteLLM adapter (adapters/passthrough.ts, detection in detect.ts, tests) is good and we'll get that in as a clean PR.

What we're not merging from this PR, and why:

MAX_CONCURRENT_SESSIONS 10→50 — You can already control this yourself via MERIDIAN_MAX_CONCURRENT env var, so no code change is needed. That said, if you're hitting concurrency limits with LiteLLM specifically, bumping this is a reasonable thing to try — just be aware higher values risk process instability since each SDK spawn is ~11MB. We've noted this in the LiteLLM docs we're adding in the clean PR.
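As a rough sketch of the trade-off described here: the env var name, the default of 10, and the ~11MB per spawn figure come from this thread, while the function names and the parsing rules are assumptions and not Meridian's actual code.

```typescript
// Default taken from the PR description (10 before the proposed bump to 50).
const DEFAULT_MAX_CONCURRENT = 10;
// Approximate per-spawn memory quoted in the review comment.
const APPROX_MB_PER_SPAWN = 11;

// Hypothetical helper: read MERIDIAN_MAX_CONCURRENT from an env-like map,
// falling back to the default when it is unset or not a positive integer.
function resolveMaxConcurrent(env: Record<string, string | undefined>): number {
  const raw = env["MERIDIAN_MAX_CONCURRENT"];
  const parsed = raw === undefined ? NaN : parseInt(raw, 10);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : DEFAULT_MAX_CONCURRENT;
}

// Rough upper bound on spawn memory if every slot is in use.
function estimatedSpawnMemoryMb(maxConcurrent: number): number {
  return maxConcurrent * APPROX_MB_PER_SPAWN;
}

const configuredLimit = resolveMaxConcurrent({ MERIDIAN_MAX_CONCURRENT: "50" });
console.log(configuredLimit, estimatedSpawnMemoryMb(configuredLimit)); // 50 550
```

In a real process the map would be `process.env`. At 50 sessions the estimate is about 550MB, against about 110MB at the default of 10.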

Rate-limit retry after partial content — The original guard ("Never retry after response content was yielded — response is committed") was intentional. Removing it for rate-limit errors risks corrupting or duplicating an in-flight SSE stream. The client has already received partial events — resuming on the same connection isn't safe. This needs more careful thought as a standalone change.
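A minimal sketch of the kind of guard being defended here, assuming the retry loop tracks whether any content has already reached the client. The type and function names are hypothetical, and treating HTTP 429 as the rate-limit signal is an assumption; this is not the project's actual retry code.

```typescript
// Hypothetical per-request retry bookkeeping.
interface RetryState {
  attempt: number;         // retries already performed
  contentYielded: boolean; // true once any response content reached the client
}

// Decide whether a failed upstream call may be retried.
function canRetryRateLimit(
  httpStatus: number,
  state: RetryState,
  maxRetries: number,
): boolean {
  // Never retry after response content was yielded: the response is committed,
  // and replaying it would duplicate or corrupt the in-flight SSE stream.
  if (state.contentYielded) return false;
  if (state.attempt >= maxRetries) return false;
  return httpStatus === 429; // assumption: rate limits surface as HTTP 429
}

console.log(canRetryRateLimit(429, { attempt: 0, contentYielded: false }, 2)); // true
console.log(canRetryRateLimit(429, { attempt: 0, contentYielded: true }, 2));  // false
```

The PR's change amounts to dropping the first check for rate-limit errors, which is the part the review pushes back on.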

tsx dev dependency — Adds 500+ lines to the lock file for a dev convenience script. The project already has bun run ./bin/cli.ts for development. Not needed.

DEBUG_PROXY env var — The project already has CLAUDE_PROXY_DEBUG routed through claudeLog. Adding a second debug mechanism that writes raw console.error directly is inconsistent, and one of the debug lines isn't guarded by the flag at all (it fires on every rate limit for every user). We'll incorporate the useful cache visibility into the existing debug mechanism instead.

prefersStreaming() on the adapter interface — Added but never called in server.ts. The stream override is done by duplicating the LiteLLM header detection inline in server.ts instead. We'll wire it up properly so the adapter controls its own streaming preference.
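One way the adapter could control its own streaming preference is sketched below. The interface is reduced to the single optional method under discussion; the other field, the `RequestBody` shape, the return type, and the fallback to `body.stream` are assumptions for illustration only.

```typescript
// Simplified request body: only the field relevant here.
interface RequestBody {
  stream?: boolean;
}

// Reduced stand-in for the AgentAdapter interface (assumed shape).
interface AgentAdapter {
  adapterName: string;
  // Optional: return a boolean to override the client's stream setting.
  prefersStreaming?(body: RequestBody): boolean | undefined;
}

// Server-side resolution: the adapter's preference wins when it has one,
// otherwise fall back to what the request asked for.
function resolveStream(adapter: AgentAdapter, body: RequestBody): boolean {
  return adapter.prefersStreaming?.(body) ?? (body.stream === true);
}

const passthroughLike: AgentAdapter = {
  adapterName: "litellm",
  prefersStreaming: () => false, // always non-streaming, as in the PR
};
const defaultLike: AgentAdapter = { adapterName: "other" };

console.log(resolveStream(passthroughLike, { stream: true })); // false
console.log(resolveStream(defaultLike, { stream: true }));     // true
```

With this arrangement the server no longer needs to inspect LiteLLM headers itself to decide on streaming.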

We'll post the clean PR shortly and reference this one.

@endre82

Auto-detects LiteLLM requests via litellm/* User-Agent or x-litellm-* headers and routes them to a dedicated passthrough adapter.

- adapters/passthrough.ts: LiteLLM adapter with usesPassthrough()=true, prefersStreaming()=false, x-litellm-session-id for session continuity, <env cwd=...> extraction, mcp__litellm__* tool naming
- adapters/detect.ts: isLiteLLMRequest() detection, passthrough adapter wired in as priority 3 (after Droid and Crush, before OpenCode fallback)
- adapter.ts: add optional prefersStreaming(body) to AgentAdapter interface
- server.ts: move detectAdapter before stream determination; use adapter.prefersStreaming?.(body) to allow adapters to override stream setting (replaces the previous inline LiteLLM header duplication)
- proxy-litellm-adapter.test.ts: 32 new tests covering adapter behaviour and detectAdapter routing
- adapter-detection.test.ts: fix header() mock to handle no-arg call (isLiteLLMRequest calls header() with no args to inspect all headers)
- README.md: LiteLLM setup section, tested agents table entry, passthrough.ts in architecture module map

Closes #199. Based on original work in PR #201 by @endre82.
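The detection rule in this description can be sketched as a pure function over a header map. `isLiteLLMRequest` is the name used in the commit message, but the signature, the `HeaderMap` type, and the exact matching (case-insensitive, prefix match on the User-Agent) are assumptions and not the merged implementation.

```typescript
// Hypothetical plain-object view of the request headers.
type HeaderMap = Record<string, string | undefined>;

// True when the request looks like it came through LiteLLM:
// either a User-Agent starting with "litellm/" or any x-litellm-* header.
function isLiteLLMRequest(headers: HeaderMap): boolean {
  for (const [rawName, rawValue] of Object.entries(headers)) {
    const headerName = rawName.toLowerCase();
    if (headerName.startsWith("x-litellm-")) return true;
    if (
      headerName === "user-agent" &&
      (rawValue ?? "").toLowerCase().startsWith("litellm/")
    ) {
      return true;
    }
  }
  return false;
}

console.log(isLiteLLMRequest({ "User-Agent": "litellm/1.0.0" }));    // true
console.log(isLiteLLMRequest({ "x-litellm-session-id": "abc123" })); // true
console.log(isLiteLLMRequest({ "user-agent": "curl/8.0" }));         // false
```

Checking the User-Agent as well as the x-litellm-* headers matters because, as noted in the original commit message, healthchecks don't send x-litellm-* headers.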

rynfar · 2026-04-01T17:34:32Z

Closing this out now that the LiteLLM adapter has landed in #215 (shipped in v1.24.0). The core work here — adapter detection, passthrough mode, session continuity via x-litellm-session-id — is all in. Thanks again @endre82, the original PR was a solid foundation.

endre82 added 2 commits March 31, 2026 10:41

feat: add LiteLLM passthrough adapter with x-litellm-* header detection

1507c41

rynfar mentioned this pull request Apr 1, 2026

feat: add LiteLLM passthrough adapter #215

Merged

rynfar closed this Apr 1, 2026

endre82 wants to merge 2 commits into rynfar:main from endre82:feat/litellm-passthrough